Conversation
…ensors for instruments
…ated validations for each instrument
…igher level for scalability
… kernels from the argo vertical movement kernel to enable easier scalability
…operty shorthands
…ensors for instruments
…ated validations for each instrument
…igher level for scalability
…operty shorthands
…p into refactor-sensors
for more information, see https://pre-commit.ci
…p into refactor-sensors
erikvansebille
left a comment
There was a problem hiding this comment.
Looks good! A few comments below
There was a problem hiding this comment.
There seems to still be a lot of duplication between CTD and CTD_BGC. Was the intention for this PR not to remove that duplication/ Or will that come in a next PR?
There was a problem hiding this comment.
Yes, indeed it will come in the next PR... the CTD_BGC will soon be removed but I decided not to do it in this PR so as to keep it focused
| Variable("lifetime", dtype=np.float32), | ||
| ] | ||
| ) | ||
| _DRIFTER_FIXED_VARIABLES = [ |
There was a problem hiding this comment.
Not sure whether FIXED is the right term here (and in other instruments). Suggests they are fixed to the instrument. Would NONSENSOR be a better term? Or do you have another term?
| TEMPERATURE = "TEMPERATURE" | ||
| SALINITY = "SALINITY" | ||
| VELOCITY = "VELOCITY" | ||
| OXYGEN = "OXYGEN" | ||
| CHLOROPHYLL = "CHLOROPHYLL" | ||
| NITRATE = "NITRATE" | ||
| PHOSPHATE = "PHOSPHATE" | ||
| PH = "PH" | ||
| PHYTOPLANKTON = "PHYTOPLANKTON" | ||
| PRIMARY_PRODUCTION = "PRIMARY_PRODUCTION" |
There was a problem hiding this comment.
Why is this list needed? Can we do without? Seems like an extra place to change when adding a sensor..
There was a problem hiding this comment.
I prefer to keep it just because it's a lightweight way of adding a bit of consistency/convention across the codebase and helps with some things like quickly catching typos in the YAML config
| _ST_SENSOR_KERNELS: dict[SensorType, callable] = { | ||
| SensorType.TEMPERATURE: _sample_temperature, | ||
| SensorType.SALINITY: _sample_salinity, | ||
| } |
There was a problem hiding this comment.
Why explicitly mention them here, if they are also part of sensors.py? Seems a duplication?
| import yaml | ||
|
|
||
| from virtualship.errors import InstrumentsConfigError, ScheduleError | ||
| from virtualship.instruments.sensors import ( |
There was a problem hiding this comment.
Why not import all? So that if a new sensor is added, it can automatically also be imported here?
There was a problem hiding this comment.
To be clear, are you suggesting from virtualship.instruments import sensors? I think that would be a good idea :)
| """ | ||
| FieldSet-key → Copernicus-variable mapping for enabled sensors. | ||
|
|
||
| VELOCITY is a special case: one sensor provides two FieldSet variables (U and V). |
There was a problem hiding this comment.
But in Parcels, it should sample fieldset.UV, which is one Field?
There was a problem hiding this comment.
Oops! Overcomplicated things here unnecessarily
| ], | ||
| description=( | ||
| "Sensors fitted to the BGC CTD. " | ||
| "Supported: CHLOROPHYLL, NITRATE, OXYGEN, PH, PHOSPHATE, PHYTOPLANKTON, PRIMARY_PRODUCTION. " |
There was a problem hiding this comment.
Make this list the same order as the list above? Otherwise, more difficult to see whether the list is complete
| @pydantic.field_validator("sensors", mode="after") | ||
| @classmethod | ||
| def _check_sensor_compatibility(cls, value) -> list[SensorConfig]: | ||
| return _check_sensor_compatibility(value, XBT_SUPPORTED_SENSORS, "XBT") | ||
|
|
||
| @pydantic.field_serializer("sensors") | ||
| def _serialize_sensors(self, value: list[SensorConfig], _info): | ||
| return _serialize_sensor_list(value) | ||
|
|
||
| def active_variables(self) -> dict[str, str]: | ||
| """FieldSet-key → Copernicus-variable mapping for enabled sensors.""" | ||
| return build_variables_from_sensors(self.sensors) | ||
|
|
There was a problem hiding this comment.
Why are these functions required for every instrument? Can't they be reused?
| particle_vars=[ | ||
| "U", | ||
| "V", | ||
| ], # two particle variables associated with one sensor |
There was a problem hiding this comment.
Great to add so much more unit testing!
VeckoTheGecko
left a comment
There was a problem hiding this comment.
Good work so far! Really like the direction that this is going. I have a bunch of comments, some quick fixes and others more cenceptual that I think would be good to go over together.
| class _SensorMeta: | ||
| fs_key: str # map to Parcels fieldset variables | ||
| copernicus_var: str # map to Copernicus Marine Service variable names |
There was a problem hiding this comment.
I think it would be good to rename this class to _Sensor, and to include the type in the class itself
| class _SensorMeta: | |
| fs_key: str # map to Parcels fieldset variables | |
| copernicus_var: str # map to Copernicus Marine Service variable names | |
| class _Sensor: | |
| type_: SensorType | |
| fs_key: str # map to Parcels fieldset variables | |
| copernicus_var: str # map to Copernicus Marine Service variable names |
Then the SENSOR_REGISTRY can become:
SENSOR_REGISTRY: dict[SensorType, _Sensor] = {s.type_: s for s in [
... # list of sensors
]} making it easy to look up sensors, but also making it so that if a sensor its chosen it's easy to see what type it is again.
| PRIMARY_PRODUCTION = "PRIMARY_PRODUCTION" | ||
|
|
||
|
|
||
| # per-instrument allowlists of supported sensors (source truth for validation for which sensors each instrument supports) |
There was a problem hiding this comment.
I find it a bit confusing the structure of the classes here - perhaps this is something worth drawing up so that we have some clarity?
We have an Instrument class
@register_instrument(InstrumentType.ADCP)
class ADCPInstrument(Instrument):
"""ADCP instrument class."""
def __init__(self, expedition, from_data):
"""Initialize ADCPInstrument."""
variables = expedition.instruments_config.adcp_config.active_variables()
limit_spec = {
"spatial": True
} # spatial limits; lat/lon constrained to waypoint locations + buffer
super().__init__(
expedition,
variables,
add_bathymetry=False,
allow_time_extrapolation=True,
verbose_progress=False,
spacetime_buffer_size=None,
limit_spec=limit_spec,
from_data=from_data,
)But I feel that there's a lot of details in this class that aren't really related to what an instrument is, but is more glue code to get it working in VirtualShip
From my POV, its best to design things close to the physical domain. An Instrument instance has a set of sensors installed on the machine. These sensors need to match up with a list of allowed sensors (which are defined on the class, e.g., via a method .get_allowed_sensors() and a check in the __init__ that it adheres - this would remove all the _check_sensor_compatibility methods that are currently here). Then each sensor has its own details potentially related to that sensor.
Maybe it would be good for us to sit down and diagram out the abstractions here. Part of that might be brainstorming how the interface between the configs and model code looks like.
|
|
||
|
|
||
| @dataclass(frozen=True) | ||
| class _SensorMeta: |
There was a problem hiding this comment.
Is it possible for us to also add the kernels to this Sensor class? I see a bunch of mappings across the codebase relating the sensors to the kernels (where a lot of them are the same, as Erik mentioned in #328 (comment) ).
I think it would be a big win if we can put the kernels in the sensor class! I don't know how easy it would be since the right kernel might depend on more than just the choice of sensor. If that's the case - what does choosing the right kernel depend on?
| # ===================================================== | ||
| # SECTION: sensor and variable metadata and registries | ||
| # ===================================================== |
There was a problem hiding this comment.
Should we move this to sensors.py? The class and the SENSOR_REGISTRY?
| category: Literal[ | ||
| "phys", "bgc" | ||
| ] # physical vs. biogeochemical variable, used for product ID selection logic | ||
| particle_vars: list[str] # particle variable name(s) produced by this sensor |
There was a problem hiding this comment.
I see that this is only used in
virtualship/src/virtualship/utils.py
Lines 746 to 751 in 89e184a
parcels.Variable objects?
Also, for my own understanding, how does this relate to (e.g.)
virtualship/src/virtualship/instruments/argo_float.py
Lines 38 to 49 in 89e184a
_ARGO_FIXED_VARIABLES are for the base behaviour of the instrument?
| _ADCP_SENSOR_KERNELS: dict[SensorType, callable] = { | ||
| SensorType.VELOCITY: _sample_velocity, | ||
| } |
There was a problem hiding this comment.
This type hint isn't correct
| _ADCP_SENSOR_KERNELS: dict[SensorType, callable] = { | |
| SensorType.VELOCITY: _sample_velocity, | |
| } | |
| from collections.abc.Callable # put at top of file | |
| _ADCP_SENSOR_KERNELS: dict[SensorType, Callable] = { | |
| SensorType.VELOCITY: _sample_velocity, | |
| } |
Great use of typing across the PR - I think having this typing is really useful so that it gets us to think more about the structure of the codebase. Currently I don't think typechecking is run in CI (or if it is, we currently have a lot of failures). Should we map out a path forward for enabling typechecking across the codebase as well? The types are only really useful if they're enforced (something I'm working on in Parcels as well).
| @@ -50,6 +53,11 @@ def _sample_temperature(particle, fieldset, time): | |||
| particle.temperature = fieldset.T[time, particle.depth, particle.lat, particle.lon] | |||
|
|
|||
|
|
|||
There was a problem hiding this comment.
Same here about the callable type annotation - perhaps worth find-all'ing across the codebase for , callable] so all these are replaced
| import yaml | ||
|
|
||
| from virtualship.errors import InstrumentsConfigError, ScheduleError | ||
| from virtualship.instruments.sensors import ( |
There was a problem hiding this comment.
To be clear, are you suggesting from virtualship.instruments import sensors? I think that would be a good idea :)
|
|
||
| def build_particle_class_from_sensors( | ||
| sensors: list[SensorConfig], | ||
| fixed_variables: list, |
There was a problem hiding this comment.
| fixed_variables: list, | |
| fixed_variables: list[Variable], |
| def build_particle_class_from_sensors( | ||
| sensors: list[SensorConfig], | ||
| fixed_variables: list, | ||
| particle_class: type, |
There was a problem hiding this comment.
nitpick. Feel free to ignore this suggestion if you don't find it useful
| particle_class: type, | |
| particle_class: type, # generic type annotation needed for v3 particle class behaviour # TODO: Update with Parcels v4 |
Overview
This PR centralises/abstracts instrument sensor/variable sampling logic so that each instrument declares which sensors it carries (e.g.
TEMPERATURE,SALINITY,VELOCITY). Users can configure which sensors are active for each instrument in the expedition YAML.This helps pave the way for easy addition of BGC sensors to the Argo float in a future PR (#234), and consolidation of CTD + CTD_BGC into a single instrument (#260) with a combined sensor allowlist. Also makes it straightforward to add new sensors to any instrument in the future (e.g., #312, #313), and streamlines them process for developers to add new instruments (i.e. #237)
Major changes
sensors.pyininstruments/defines theSensorTypeclass and per-instrument allowlists (so that there is control over which sensors each instrument supports and users cannot configure unsupported sensors).SensorConfigpydantic model andsensorsfield in every instrument config inexpedition.py.SensorRegistryinutils.pythat maps eachSensorTypeto its FieldSet key, Copernicus variable name, category (phys/bgc), and Parcels particle variable name(s).cycle_phasefor Argo Floats.API change
As mentioned above, the instruments config section of the expedition YAML now has a
sensorslist field, where users specify which sensors are active. For example, below the CTD is configured to sampleTEMPERATUREandSALINITY:sensorslist is omitted, it default to all valid sensors for that instrument.Additional change
Follow-up PRs
planCLI tool will need updating to account for sensor configuration options. Currently not broken but doesn't give option to configure sensors. (New issue to be opened)CTDandCTD_BGCto one instrument #260)Tests